Mustapha, Aida
- An Experimental Study of Classification Algorithms for Crime Prediction
Authors
Rizwan Iqbal 1, Masrah Azrifah Azmi Murad 1, Aida Mustapha 1, Payam Hassany Shariat Panahy 1, Nasim Khanahmadliravi 1
Affiliations
1 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, MY
Source
Indian Journal of Science and Technology, Vol 6, No 3 (2013), Pagination: 4219-4225
Abstract
Classification is a well-known supervised learning technique in data mining. It is used to extract meaningful information from large datasets and can be effectively used to predict unknown classes. In this research, classification is applied to a crime dataset to predict 'Crime Category' for different states of the United States of America. The crime dataset is real: it combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and crime data from the 1995 FBI UCR. This paper compares two classification algorithms, Naïve Bayesian and Decision Tree, for predicting 'Crime Category' for different states in the USA. The experimental results showed that the Decision Tree algorithm outperformed the Naïve Bayesian algorithm, achieving 83.9519% accuracy in predicting 'Crime Category' for different states of the USA.
Keywords
Crime Prediction, Crime Category, Algorithm
- A Framework to Construct Data Quality Dimensions Relationships
Authors
Payam Hassany Shariat Panahy 1, Fatimah Sidi 1, Lilly Suriani Affendey 1, Marzanah A. Jabar 1, Hamidah Ibrahim 1, Aida Mustapha 1
Affiliations
1 Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Selangor, 43400, MY
Source
Indian Journal of Science and Technology, Vol 6, No 5 (2013), Pagination: 4422-4431
Abstract
Data and information obtained from data analysis are essential assets for constructing and supporting information systems. Because data is a significant resource, its quality is critical to the effectiveness of business processes. Relationships among the four major data quality dimensions are often neglected in process improvement. For this reason, this study proposes a reliable framework to support process activities in information systems. The study focuses on four critical quality dimensions: accuracy, completeness, consistency, and timeliness. A qualitative approach was conducted using a questionnaire, and the responses were assessed to measure the reliability and validity of the survey. Factor analysis and the Cronbach's alpha test were applied to interpret the results. The results show that the items of each data quality dimension and of the improvement process are reliable and valid. This framework can be used to evaluate data quality in an information system in order to improve the processes involved.
Keywords
Data Quality Dimension, Framework, Relationship, Validation, Information Systems, Factor Analyzing
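The Cronbach's alpha reliability test named in the abstract above can be illustrated with a minimal sketch. It uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of questionnaire items. The function name `cronbach_alpha` and the sample responses are hypothetical and are not taken from the study's instrument or data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a list of respondent rows.

    Each row holds one respondent's scores, one score per questionnaire item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose rows into per-item columns and take each item's variance.
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of each respondent's summed score across all items.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert answers: 5 respondents, 3 items (made-up data).
sample = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 2, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(sample)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Values closer to 1 indicate that the items vary together, which is the sense in which a set of questionnaire items is called reliable. The threshold a study treats as acceptable is a methodological choice and is not stated in the abstract.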